Will SOUN's Focus on Multimodal AI Differentiate It From Rivals?

Key Takeaways

  • SOUN's Polaris model combines voice and vision to boost speed, accuracy and use cases.
  • Revenues jumped 217% to $42.7M in Q2, with guidance raised to $160-$178M for 2025.
  • SOUN faces rivals like Amazon and Google but leans on specialization and 20 years of data.

SoundHound AI, Inc. (SOUN) is doubling down on multimodal AI as its key differentiator in the competitive conversational AI space. The company’s latest foundation model, Polaris, blends voice and vision capabilities to deliver real-time understanding across multiple inputs. Management argues that this approach not only improves accuracy and speed but also expands the range of use cases across industries such as automotive, restaurants and enterprise services.

The second quarter underscored this momentum. Revenues surged 217% year over year to $42.7 million, beating expectations, with demand rising across verticals. CEO Keyvan Mohajer highlighted how integrating vision AI into the stack “marks a transformational shift” in human-computer interaction, enabling more natural and intuitive engagement with devices. Customers migrating to Polaris are reportedly seeing immediate improvements, which have driven renewals, upselling and stronger close rates.

Financially, SoundHound remains unprofitable, posting a non-GAAP net loss of $11.9 million, or 3 cents per share. Still, its robust pipeline, expanding partnerships and raised 2025 revenue guidance of $160-$178 million indicate confidence in scaling the business.

The challenge lies in execution. Larger rivals like Amazon (AMZN) and Alphabet’s (GOOGL) Google possess deep resources and entrenched ecosystems. However, SoundHound’s specialization in multimodal AI, coupled with 20 years of proprietary data and a growing roster of global automotive and QSR clients, positions it to compete on quality rather than scale.

Ultimately, if adoption of multimodal AI accelerates, SoundHound’s early lead could give it a sustainable edge. However, investors will need to watch closely whether this differentiation translates into lasting profitability.

How Does SoundHound Stack Up Against Big Tech Rivals?

While SoundHound AI is leaning on multimodal innovation to stand out, it faces stiff competition from established players. Amazon, through Alexa and its broader cloud ecosystem, remains a dominant force in voice-enabled AI. Its advantage lies in scale, distribution and integration into smart homes and consumer devices. However, Amazon’s solutions have been slower to adopt multimodal capabilities at the level SoundHound claims with Polaris, leaving room for niche specialists.

Google is another formidable rival, with Google Assistant and cutting-edge AI research under its DeepMind and Gemini banners. The company has the infrastructure and data advantage to rapidly expand multimodal features across devices and enterprise services. Yet, Google’s focus is spread across multiple AI initiatives, potentially diluting attention to specific verticals like automotive or restaurants.

For now, SoundHound’s differentiation comes from deep domain expertise and agility. But the long-term test will be whether specialization can consistently outpace Big Tech’s scale.
